CHAPTER 3 Data Mining in Scientific Data
Abstract
Knowledge discovery in scientific data, i.e. the extraction of engineering knowledge in the form of a mathematical model description from experimental data, is currently an important part of the industrial re-engineering effort toward improved knowledge reuse. Although large collections of data have been acquired in the past through expensive investigations, both numerical simulations and experiments, the systematic use of data mining algorithms for knowledge extraction from data is still in its infancy. In contrast to data sets collected in business and finance, scientific data possess additional properties specific to their domain of origin. First, the principle of cause and effect has a strong impact and implies the completeness of the parameter list of the unknown functional model more rigorously than one would assume in other domains, such as financial credit-worthiness data or client behavior analyses. Second, scientific data are usually rich in physical unit information, which represents an important piece of structural knowledge in the underlying model formation theory in the form of dimensionally homogeneous functions. Based on these features of scientific data, a similarity transformation using the measurement unit information of the data can be performed. This similarity transformation eliminates the scale dependency of the numerical data values and creates a set of dimensionless similarity numbers. Together with reasoning strategies from artificial intelligence such as case-based reasoning, these similarity numbers may be used to estimate many engineering properties of the technical object or process under consideration. Furthermore, the similarity transformation usually reduces the complexity of the resulting unknown similarity function, which can be approximated using different techniques.
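The abstract does not say how the similarity transformation is computed. One common way to obtain dimensionless similarity numbers from unit information is the null-space construction behind the Buckingham Pi theorem, sketched below. The variable set (density, velocity, diameter, dynamic viscosity), the numerical values, and the helper names `nullspace` and `pi_value` are illustrative assumptions, not taken from the chapter.

```python
import math
from fractions import Fraction


def nullspace(matrix):
    """Return a basis of the null space of `matrix` (a list of rows).

    Exact Gauss-Jordan elimination with fractions; each basis vector holds
    the exponents of one dimensionless product of the variables.
    """
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # Each free (non-pivot) column yields one basis vector.
    basis = []
    for f in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -rows[row_idx][f]
        basis.append(vec)
    return basis


def pi_value(exponents, values):
    """Evaluate the product of values[i] ** exponents[i]."""
    return math.prod(v ** float(e) for v, e in zip(values, exponents))


# Assumed example. Columns: density, velocity, diameter, dynamic viscosity.
# Rows: exponents of mass, length, time in each variable's unit.
dimension_matrix = [
    [1, 0, 0, 1],     # mass
    [-3, 1, 1, -1],   # length
    [0, -1, 0, -1],   # time
]
basis = nullspace(dimension_matrix)

# The same physical state expressed in two unit systems.
values_si = [1000.0, 2.0, 0.05, 0.001]   # kg/m^3, m/s, m, kg/(m s)
values_cgs = [1.0, 200.0, 5.0, 0.01]     # g/cm^3, cm/s, cm, g/(cm s)
pi_si = pi_value(basis[0], values_si)
pi_cgs = pi_value(basis[0], values_cgs)

print("exponents:", [str(e) for e in basis[0]])
print("SI:", pi_si, "CGS:", pi_cgs)
```

For this variable set the null space is one-dimensional, and the single product is the reciprocal of the Reynolds number. Its value is the same in both unit systems, which illustrates the scale independence the abstract refers to.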
Similar resources
Parallel and Distributed Data Mining: An Introduction
The explosive growth in data collection in business and scientific fields has literally forced upon us the need to analyze and mine useful knowledge from it. Data mining refers to the entire process of extracting useful and novel patterns/models from large datasets. Due to the huge size of data and amount of computation involved in data mining, high-performance computing is an essential compone...
Intelligent Approaches to Mining the Primary Research Literature: Techniques, Systems, and Examples
In this chapter, we describe how creating knowledge bases from the primary biomedical literature is formally equivalent to the process of performing a literature review or a ‘research synthesis’. We describe a principled approach to partitioning the research literature according to the different types of experiments performed by researchers and how knowledge engineering approaches must be caref...
Concept Formation in Scientific Knowledge Discovery from a Constructivist View
This chapter argues that a computer-aided scientific knowledge discovery tool should facilitate scientific knowledge development by assisting scientists in building first-person and third-person knowledge. The chapter reviews cognitive theories of human knowledge construction and presents a hydrological modelling scenario as an exemplar of these concepts. A number of challenges for...
Toward Automatic Annotation of Genes and Proteins
This chapter introduces the use of Text Mining in scientific literature for biological research, with a special focus on automatic gene and protein annotation. This field has recently become a major topic in Bioinformatics, motivated by the opportunity to tap the BioLiterature with automatic text processing software. The chapter describes the main approaches adopted and analyzes systems...
Computational Discovery of Scientific Knowledge
This chapter introduces the field of computational scientific discovery and provides a brief overview thereof. We first try to be more specific about what scientific discovery is and also place it in the broader context of the scientific enterprise. We discuss the components of scientific behavior, that is, the knowledge structures that arise in science and the processes that manipulate them. W...
Chapter 16 Mining Encrypted Data
Business and scientific organizations nowadays own databases containing confidential information that needs to be analyzed, through data mining techniques, in order to support their planning activities. The need for privacy arises from either legal restrictions (for medical and socio-economic databases) or the unwillingness of business organizations to share their data which are consi...
Publication date: 2001